kNN at TREC-9
Abstract
We applied a multi-class k-nearest-neighbor based text classification algorithm to the adaptive and batch filtering problems in the TREC-9 filtering track. While our systems performed well in the batch filtering tasks, they did not perform as well in the adaptive filtering tasks, in part because we did not have an adequate mechanism for taking advantage of the relevance feedback information provided by the filtering tasks. Since TREC-9, we have made considerable improvements in our batch filtering results and discovered some serious problems with both the T9P and T9U metrics. In this paper, we discuss these issues and their impact on our filtering results.
Similar resources
Recommending Points-of-Interest via Weighted kNN, Rated Rocchio, and Borda Count Fusion
We present the work of the Democritus University of Thrace (DUTH) team in TREC’s 2016 Contextual Suggestion Track. The goal of the Contextual Suggestion Track is to build a system capable of proposing venues which a user might be interested to visit, using any available contextual and personal information. First, we enrich the TREC-provided dataset by collecting more information on venues from ...
York University at TREC 2005: SPAM Track
We propose a variant of the k-nearest neighbor classification method, called instance-weighted k-nearest neighbor method, for adaptive spam filtering. The method assigns two weights, distance weight and correctness weight, to a training instance, and makes use of the two weights when classifying a new email. The correctness weight is also used in the maintenance of the training data to make the...
The Thisl SDR System at TREC-9
This paper describes our participation in the TREC-9 Spoken Document Retrieval (SDR) track. The THISL SDR system consists of a realtime version of a hybrid connectionist/HMM large vocabulary speech recognition system and a probabilistic text retrieval system. This paper describes the configuration of the speech recognition and text retrieval systems, including segmentation and query expansion. ...
TREC 2004 Genomics Track Experiments at IUB
This paper describes the methods we developed for the three tasks of the TREC Genomics Track, i.e., ad hoc retrieval, triage, and annotation tasks. For the ad hoc retrieval task, we used the classic vector space model and studied the use of query expansion and pseudo-relevance feedback. Our submitted runs obtained a MAP of 0.183. For the triage task, we adopted a naïve Bayes classifier trained o...